Abstract

In this paper we consider the problem of ranking (partitioning) k populations according to a parameter which is defined as a functional of the distribution functions of the underlying populations. We obtain minimax rules for general loss functions and Bayes rules for some specific loss functions, and we propose approximate non-randomized minimax rules. We also derive restricted minimax rules for selecting a subset of populations which are better than a control. Some nonparametric “optimal” tests are derived for different hypotheses written in terms of the parameter as a functional of the underlying distribution function.
1 Introduction

In practice, the experimenter is often faced with the problem of comparing k populations, for example, comparing k different treatments in clinical trials, or comparing k different varieties of grain in an agricultural experiment. The classical tests of homogeneity never answer the question “what next?” if the hypothesis is rejected. Mosteller (1948), Paulson (1949) and Bahadur (1950) were among the first research workers to recognize the inadequacy of such tests for homogeneity and to reformulate the problem as a multiple decision problem concerned with the ranking and selection of k populations.

* This research was supported in part by the Office of Naval Research Contract N00014-88-K-017 and NSF Grants DMS-86066964, DMS-8702620 at Purdue University.

One approach, pioneered by Bechhofer (1954), allows the experimenter to select one population which is guaranteed to be of interest to him with a fixed probability P*, whenever the unknown parameters lie outside some subset of the parameter space. This has been termed the indifference zone approach. In contrast to the indifference zone approach, Gupta (1956) proposed a formulation in which the experimenter obtains a subset of the k populations for which there is a fixed minimum probability P*, over the entire parameter space, that the population of interest is selected. For an extensive review of the subset selection methodology see Gupta and Panchapakesan (1979) and Gupta and Panchapakesan (1986).

In this paper we consider a decision theoretic formulation of the ranking problem in the nonparametric setup. Let the distribution function $F$ on $R^p$ be characterized by the functional $\theta(F) = \int g \, dF$, where $g$ is a known real-valued bounded function on $R^p$ and $\theta = \theta(F)$ is the parameter of interest.

Consider  the  following  examples. 

(1)  SELECTING  THE  BEST: 

Company A produces a product whose observable quality is represented by a random variable $Y$. Company B has discovered k new products of the same “type” and wants to select the one of those k products which will beat the product of company A in the market. Let us suppose $X(i)$ represents the quality of the i-th product of company B for $i = 1, \ldots, k$. A customer will select the product of company A instead of a specified i-th product of company B if $Y$ is greater than $X(i)$. Hence in this problem the parameter of interest is $\theta(i) = \Pr(X(i) > Y)$, and company B wants to select the product for which $\theta(i)$ is largest. Here $g$ is the distribution function of $Y$.

The quantity $\Pr(X < Y)$ is of considerable importance in many practical situations, such as clinical trials, genetics, and reliability. For the estimation of the parameter $\Pr(X < Y)$ and for related references see Brownie (1988) and Simonoff, Hochberg and Reiser (1986). In Section 4, “optimal” non-parametric tests of various hypotheses about the parameter $\theta = \Pr(X < Y)$ are derived.

(2)  REGRESSION: 

Let $X = (X_1, X_2, \ldots, X_p)$ be a p-dimensional random vector which has the distribution function $F$. We want to test whether $x_1$ is well approximated by $h(x_2, x_3, \ldots, x_p)$, where $h$ is a known real-valued function on $R^{p-1}$. Define

$$\theta(F) = \int d\big(x_1 - h(x_2, x_3, \ldots, x_p)\big) \, dF,$$

where $d$ is an appropriate non-negative function on $R$. In this situation $g(x) = d(x_1 - h(x_2, x_3, \ldots, x_p))$. We may want to test

$$H_0 : \theta(F) \le \theta_0 \quad \text{vs} \quad H_1 : \theta(F) > \theta_0,$$

where $\theta_0$ is a known constant.

(3)  SELECTING  A  SUBSET  OF  THE  POPULATIONS  CONTAINING  A  POPULATION 
BETTER  THAN  THE  CONTROL: 

Let $x_\alpha$ be the $\alpha$-th quantile of the control. There are k populations $\Pi_1, \Pi_2, \ldots, \Pi_k$. The population $\Pi_i$ is associated with the distribution function $F_i$ on $R$, for $i = 1, 2, \ldots, k$. We say the population $\Pi_i$ is

$$\text{“good” if } \int_{-\infty}^{x_\alpha} dF_i \ge \alpha$$

and

$$\text{“bad” if } \int_{-\infty}^{x_\alpha} dF_i \le \alpha - \delta.$$

In this problem $g(x) = I_{(-\infty, x_\alpha)}(x)$.

It  is  important  to  consider  a  non-parametric  model,  since  often  in  practice,  especially 
for  the  new  treatments,  there  is  not  much  information  which  could  lead  us  to  assume  some 
parametric  model. 

In the next section we derive a minimax procedure for the selection and ranking problem. We also obtain a restricted minimax procedure for the problem in which the populations are compared with a control.

Our procedures, however, are randomized. We feel that randomization is unavoidable in the present situation since, as is known, certain properties of the risk function can be improved only by using randomization. In some examples we will also prove that these procedures are unique and admissible. In Section 4 we will derive some “optimal” non-parametric tests.

Most of the existing results on non-parametric models are asymptotic. The finite sample results presented here may be of use for checking the optimality of existing procedures (tests) or for proving the optimality of new tests.

It should be pointed out that the results presented here do not apply to the problem of selecting the population with the largest $\alpha$-th quantile (or largest location parameter). Nor do they apply to the problem of selecting a subset of the populations which contains the population with the largest $\alpha$-th quantile (or location parameter). A considerable amount of work has been done on problems of that kind; see Barlow and Gupta (1969), Gupta and McDonald (1970), Gupta and Huang (1974), Rizvi and Sobel (1967) and Sobel (1967). An extensive review of non-parametric selection and ranking procedures is given in Desu and Bristol (1986).
2 Selection And Ranking

There are k populations $\Pi_1, \Pi_2, \ldots, \Pi_k$. The population $\Pi_i$ is associated with the cumulative distribution function $F_i(\cdot)$ on $R^p$, for $i = 1, 2, \ldots, k$. The population $\Pi_i$ is characterized by the real-valued functional

$$\theta(F_i) = \int_{R^p} g(x) \, dF_i(x),$$

where $g$ is a known, real-valued bounded function on $R^p$.

Define $\theta_i = \theta(F_i)$ for $i = 1, 2, \ldots, k$ and

$$F = (F_1, F_2, \ldots, F_k), \qquad \theta = \theta(F) = (\theta_1, \theta_2, \ldots, \theta_k).$$

Let

$$\mathcal{F} = \{(F_1, F_2, \ldots, F_k) : F_i \text{ is a distribution on } R^p\}$$

and

$$\Theta = \{(\theta(F_1), \theta(F_2), \ldots, \theta(F_k)) : F_i \text{ is a distribution on } R^p\}.$$

Let $X_{i1}, X_{i2}, \ldots, X_{in}$ be n independent random vectors from population $\Pi_i$.

Problem  (I)  General  Ranking  Problem: 

On the basis of a set of observations we wish to partition the set of coordinate values of the k-dimensional parameter vector $\theta = (\theta_1, \theta_2, \ldots, \theta_k)'$ into r disjoint subsets, say $S_1, S_2, \ldots, S_r$, such that $S_1$ contains the $t_1$ largest components of $\theta$, $S_2$ contains the next $t_2$ largest components of $\theta$, and so on, until $S_r$ contains the $t_r$ smallest components of $\theta$. The size of each subset is fixed in advance and $\sum_{i=1}^{r} t_i = k$.

Let the action space $\mathcal{A}$ be the set of all possible partitions of the set $\{1, 2, \ldots, k\}$ into r subsets $S_1, S_2, \ldots, S_r$ of sizes $t_1, t_2, \ldots, t_r$, respectively. For $a \in \mathcal{A}$ let $a = (S_{a,1}, S_{a,2}, \ldots, S_{a,r})$.
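As an illustration of the action space just described, the following minimal Python sketch enumerates all partitions of $\{1, \ldots, k\}$ into ordered groups of fixed sizes. The function name `ordered_partitions` and the values of `k` and `sizes` are hypothetical choices made only for this example; the count is compared with the multinomial coefficient $k!/(t_1! \cdots t_r!)$.

```python
from itertools import combinations
from math import factorial, prod


def ordered_partitions(items, sizes):
    """Yield every split of `items` into groups S_1, ..., S_r with the given sizes."""
    items = list(items)
    if not sizes:
        if not items:
            yield ()
        return
    for first in combinations(items, sizes[0]):
        rest = [i for i in items if i not in first]
        for tail in ordered_partitions(rest, sizes[1:]):
            yield (frozenset(first),) + tail


# Hypothetical example: k = 4 populations, groups of sizes t = (1, 2, 1).
k, sizes = 4, (1, 2, 1)
actions = list(ordered_partitions(range(1, k + 1), sizes))

# The number of actions should equal k! / (t_1! t_2! ... t_r!).
expected_count = factorial(k) // prod(factorial(t) for t in sizes)
print(len(actions), expected_count)
```

Each action is represented as a tuple of frozensets, so it can be used directly as a dictionary key when a decision rule assigns probabilities to actions.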

A decision rule $\delta = \delta(\cdot)$,

$$\delta(\cdot) = \{\delta(\cdot, a) : a \in \mathcal{A}\}, \qquad (1)$$

is a measurable function on $R^{npk}$ such that

$$0 \le \delta(\cdot, a) \le 1$$

and

$$\sum_{a \in \mathcal{A}} \delta(x, a) = 1.$$

If $X = x$ is observed then the decision rule $\delta$ takes the action $a$ with probability $\delta(x, a)$.

Let $\mathcal{D}$ be the class of all decision rules. We will consider loss functions which are “invariant”, “non-negative” and “monotone.” This type of loss structure has been considered by several authors; see, for example, Eaton (1967) and Gupta and Miescke (1984). Let $L(\cdot, \cdot)$, a real-valued measurable function, be a loss function on $\Theta \times \mathcal{A}$. Hence if one takes action $a$ and the true parameter is $\theta$, then the loss is $L(\theta, a)$. Formally we write the conditions on the loss function as:

[1] $L(\theta, a) \ge 0$.

[2] For every permutation $\pi$ on $\{1, 2, \ldots, k\}$, $L(\pi(\theta), \pi(a)) = L(\theta, a)$ $\forall a \in \mathcal{A}$.

[3] Let $\theta_i \ge \theta_j$ and let $a = (S_1, S_2, \ldots, S_r)$, $a' = (S'_1, S'_2, \ldots, S'_r)$ be such that, for some $r_1$ and $r_2$ with $1 \le r_1 < r_2 \le r$, $S_t = S'_t$ $\forall t \ne r_1$ and $t \ne r_2$, and $S'_{r_1} = (S_{r_1} - \{j\}) \cup \{i\}$, $S'_{r_2} = (S_{r_2} - \{i\}) \cup \{j\}$. Then

$$L(\theta, a) \le L(\theta, a').$$

[4] For every $a \in \mathcal{A}$, $L(\theta, a)$ is a continuous function of $\theta$.
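To make conditions [1] and [2] concrete, the sketch below checks them numerically for one simple loss, the loss $\max_j \theta_j - \theta_i$ for selecting a single population $i$ as the best (the case $r = 2$, $t = (1, k-1)$). The helper names and the parameter values are assumptions for illustration only, and a numerical check on one parameter point is of course not a proof.

```python
from itertools import permutations


def loss_best(theta, selected):
    """Loss for declaring population `selected` the best: max_j theta_j - theta_selected."""
    return max(theta) - theta[selected]


def relabel(theta, selected, pi):
    """Apply the relabelling i -> pi[i] to both the parameter vector and the action."""
    new_theta = [0.0] * len(theta)
    for i, value in enumerate(theta):
        new_theta[pi[i]] = value
    return new_theta, pi[selected]


theta = [0.2, 0.7, 0.5]  # hypothetical parameter vector, 0-based population labels

# Condition [1]: the loss is non-negative for every action.
non_negative = all(loss_best(theta, i) >= 0 for i in range(len(theta)))

# Condition [2]: relabelling theta and the action together leaves the loss unchanged.
invariant = all(
    loss_best(*relabel(theta, i, pi)) == loss_best(theta, i)
    for pi in permutations(range(len(theta)))
    for i in range(len(theta))
)
print(non_negative, invariant)
```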
The risk function of the decision rule $\delta$ is given by

$$R_1(F, \delta) = E_F L(\theta(F), \delta) = \sum_{a \in \mathcal{A}} L(\theta(F), a) \, E_F \delta(X, a).$$

A minimax rule will be derived for the problem described above.

Problem  (II)  Selecting  “good”  populations: 

We describe this problem as in Lehmann (1961). Let there be a fixed value $\theta_0$ and let $\Delta$ be a fixed positive real number. The population $\Pi_i$ is said to be good (positive) if $\theta_i(F) \ge \theta_0 + \Delta$ and bad if $\theta_i(F) \le \theta_0$. We wish to select a subset of the populations containing good populations, provided there exists at least one good population.

For this problem we will consider two loss functions: one guards against selecting too many bad populations and the other makes sure that good populations are being selected. As in Lehmann (1961), the following criteria will be used for measuring how well the procedure carries out the task:

(S1) The expected number of good populations.

(S2) The expected proportion of good populations.

(S3) The probability of selecting at least one good population, provided there exists one.

(S4) The probability of including the “best” population provided it is “good”.

The following criteria are considered for measuring the performance of the procedure:

(R1) The number of bad populations in the selected subset.

(R2) The proportion of bad populations in the selected subset.

For a subset selection procedure $\delta$, $S(\theta(F), \delta)$ is given by (S1), (S2), (S3) or (S4) and $R(\theta(F), \delta)$ is given by (R1) or (R2). Let

$$\mathcal{F}' = \{F : \theta(F_i) \ge \theta_0 + \Delta \text{ for some } i, \ 1 \le i \le k\}$$

and let $\mathcal{D}_s$ be the class of all procedures for selecting a subset of good populations. We will construct a restricted minimax procedure for this problem, that is, we will construct a procedure $\delta_m \in \mathcal{D}_s$ which minimizes $\sup_{F \in \mathcal{F}} R(\theta(F), \delta)$ among all $\delta \in \mathcal{D}_s$ for which

$$\inf_{F \in \mathcal{F}'} S(\theta(F), \delta) \ge p,$$

where $p$ is a given fixed number.

To prove the main results, we need results from Eaton (1967) and from Lehmann (1961). For the sake of completeness, we state them.

Theorem 2.1: Let the random variable $X_i$ have density $p_{\theta_i}(x)$, $i = 1, 2, \ldots, k$, where $p_\theta(x)$ has monotone likelihood ratio in $x$, and let $R_1(\theta, \delta)$ be as defined in Problem I.

Let

$$B_a = \{x : x_{i_1} \ge x_{i_2} \ge \cdots \ge x_{i_r} \ \ \forall \, i_j \in S_{a,j} \ \forall j\}$$

and let

$$H(x) = \{a : a \in \mathcal{A}, \ x \in B_a\}, \qquad n(x) = \text{number of elements in } H(x).$$

Define

$$\delta'(x, a) = \frac{1}{n(x)} \quad \text{if } a \in H(x) \qquad (2)$$

$$\phantom{\delta'(x, a)} = 0 \quad \text{otherwise.} \qquad (3)$$

Then the rule $\delta'$ minimizes $\sup_\theta R_1(\theta, \delta)$ among all $\delta \in \mathcal{D}$.
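The rule in (2) and (3) can be sketched in a few lines of Python: list the actions whose grouping is compatible with the ordering of the observations and spread the probability evenly over them. This is only an illustrative sketch; the function names, the 0-based population labels and the example data are assumptions, and all group sizes are assumed to be at least one.

```python
from fractions import Fraction
from itertools import combinations


def ordered_partitions(items, sizes):
    """Yield every split of `items` into groups S_1, ..., S_r with the given sizes."""
    items = list(items)
    if not sizes:
        if not items:
            yield ()
        return
    for first in combinations(items, sizes[0]):
        rest = [i for i in items if i not in first]
        for tail in ordered_partitions(rest, sizes[1:]):
            yield (frozenset(first),) + tail


def in_B(x, action):
    """True if x lies in B_a: every value in a group is >= every value in the next group."""
    return all(
        min(x[i] for i in upper) >= max(x[j] for j in lower)
        for upper, lower in zip(action, action[1:])
    )


def delta_prime(x, sizes):
    """Return {action: probability}: uniform on H(x), zero elsewhere."""
    actions = list(ordered_partitions(range(len(x)), sizes))
    h = [a for a in actions if in_B(x, a)]
    return {a: (Fraction(1, len(h)) if a in h else Fraction(0)) for a in actions}


# Hypothetical data: k = 3 populations, select the single best (sizes 1 and 2).
# Populations 0 and 2 are tied for the largest value, so each is chosen with probability 1/2.
rule = delta_prime((3, 1, 3), (1, 2))
for action, prob in rule.items():
    print([sorted(group) for group in action], prob)
```

Randomization only occurs when there are ties that straddle two groups; without such ties $H(x)$ contains a single action.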


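Theorem 2.2: Let $X_1, X_2, \ldots, X_k$ be independent random variables with probability densities $p_{\theta_1}(x_1), p_{\theta_2}(x_2), \ldots, p_{\theta_k}(x_k)$ respectively. Let $p_\theta(x)$ have monotone likelihood ratio in $x$. Define $R(\theta, \delta)$ and $S(\theta, \delta)$ as in Problem II, where $\delta$ is a subset selection procedure. Let $\delta_s = (\delta_{1s}, \ldots, \delta_{ks})$, where

$$\delta_{is} = \begin{cases} 1 & \text{if } x_i > c \\ \lambda_0 & \text{if } x_i = c \\ 0 & \text{otherwise,} \end{cases}$$

and $\lambda_0$ and $c$ are determined by the equation $E_{\theta_0 + \Delta} \delta_{is} = p$. Then the rule $\delta_s$ minimizes $\sup_{\theta \in \Omega} R(\theta, \delta)$ among all rules $\delta \in \mathcal{D}_s$ such that

$$\inf_{\theta \in \Omega'} S(\theta, \delta) \ge p,$$

where

$$\Omega' = \{(\theta_1, \theta_2, \ldots, \theta_k) : \theta_i \ge \theta_0 + \Delta \text{ for some } i\}.$$

The supremum of $R(\theta, \delta_s)$ is attained at $\theta = (\theta_0, \theta_0, \ldots, \theta_0)$ and the infimum of $S(\theta, \delta_s)$ is attained at $\theta = (\theta_0 + \Delta, \theta_0, \ldots, \theta_0)$.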

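Let $X_1, X_2, \ldots, X_k$ be independent binomial random variables with parameters $(n, \theta_1), (n, \theta_2), \ldots, (n, \theta_k)$ respectively. For $\delta_s(\cdot) = \delta_s(x)$,

$$\sup_{\theta \in \Omega} R(\theta, \delta_s) = n\,[P_{\theta_0}(X > c) + \lambda P_{\theta_0}(X = c)] \qquad (4)$$

$$\phantom{\sup_{\theta \in \Omega} R(\theta, \delta_s)} = h(\theta_0, \Delta, p), \text{ say.} \qquad (5)$$

Here $c$, a non-negative real number, and $\lambda$ in $(0, 1)$ are chosen such that

$$P_{\theta_0 + \Delta}(X_1 > c) + \lambda P_{\theta_0 + \Delta}(X_1 = c) = p. \qquad (6)$$

If $V_i$ is a sequence of binomial random variables with parameters $(n, p_i)$ and $p_i$ converges to $p_0$ as $i \to \infty$, then the sequence $V_i$ converges weakly to $V_0$. We note that, for a fixed $p$, $h(\theta_0, \Delta, p)$ is a continuous function of $\theta_0$ and $\Delta$.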
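The constants in (6) can be found by accumulating binomial upper-tail mass from the top until the target level is crossed. The sketch below is one way to do this, assuming an integer cutoff $c$ and a target strictly between 0 and 1; the function names and the numerical values of `n`, `theta0`, `delta` and `target` are hypothetical.

```python
from math import comb


def binom_pmf(n, theta, x):
    """P(X = x) for X ~ Binomial(n, theta)."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def randomized_cutoff(n, theta, target):
    """Find integer c and lam in [0, 1) with P(X > c) + lam * P(X = c) = target."""
    upper = 0.0  # running value of P(X > c)
    for c in range(n, -1, -1):
        mass = binom_pmf(n, theta, c)
        if upper + mass > target:
            return c, (target - upper) / mass
        upper += mass
    raise ValueError("target must be strictly below 1")


def selection_probability(n, theta, c, lam):
    """P(X > c) + lam * P(X = c) for X ~ Binomial(n, theta)."""
    tail = sum(binom_pmf(n, theta, x) for x in range(c + 1, n + 1))
    return tail + lam * binom_pmf(n, theta, c)


# Hypothetical values: n = 10 observations, theta_0 = 0.5, Delta = 0.2, target level 0.9.
n, theta0, delta, target = 10, 0.5, 0.2, 0.9
c, lam = randomized_cutoff(n, theta0 + delta, target)

# Probability that a population sitting exactly at theta_0 is selected.
prob_bad_selected = selection_probability(n, theta0, c, lam)
print(c, lam, prob_bad_selected)
```

The bracketed term in (4) is the quantity `prob_bad_selected` computed at $\theta_0$ with these constants.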
Making the transformation on $g$, if necessary,

$$g^*(x) = \frac{g(x) - \inf g(x)}{\sup g(x) - \inf g(x)},$$

and observing that

$$\int [a\, g(x) + b] \, dF(x) = a\, \theta(F) + b,$$

without any loss of generality we assume that $0 \le g(x) \le 1$ $\forall x \in R^p$ and $\sup g(x) = 1$, $\inf g(x) = 0$.
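The normalization above can be checked numerically on a small example. The sketch assumes a hypothetical bounded function $g(x) = \arctan(x)$ on the real line, whose infimum and supremum are $-\pi/2$ and $\pi/2$, and a hypothetical three-point distribution; it verifies that the parameter transforms affinely.

```python
import math

G_INF, G_SUP = -math.pi / 2, math.pi / 2  # inf and sup of the assumed g over the real line


def g(x):
    """Hypothetical bounded score function."""
    return math.atan(x)


def g_star(x):
    """Rescaled version of g taking values in [0, 1]."""
    return (g(x) - G_INF) / (G_SUP - G_INF)


# Hypothetical discrete distribution F with three support points.
support = [-2.0, 0.0, 1.5]
probs = [0.2, 0.5, 0.3]

theta = sum(p * g(x) for x, p in zip(support, probs))
theta_star = sum(p * g_star(x) for x, p in zip(support, probs))

# g_star = a * g + b with a = 1 / (sup - inf) and b = -inf / (sup - inf).
a = 1 / (G_SUP - G_INF)
b = -G_INF / (G_SUP - G_INF)
print(theta_star, a * theta + b)
```

Because the map is increasing and affine, ranking populations by the rescaled parameter is the same as ranking them by the original one.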

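Lemma 2.1: If $\delta \in \mathcal{D}_s$ is a subset selection procedure for Problem II and

$$\inf_{F \in \mathcal{F}'} S(\theta(F), \delta) \ge p,$$

then

$$\sup_{F \in \mathcal{F}} R(\theta(F), \delta) \ge h(\theta_0, \Delta, p),$$

where $h(\theta_0, \Delta, p)$ is as defined in (4).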

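Proof:

We know that $\inf g(x) = 0$ and $\sup g(x) = 1$. Let $\delta \in \mathcal{D}_s$. Fix $\epsilon > 0$, and get $a$ and $b$ such that $g(a) = \epsilon_1$, $g(b) = 1 - \epsilon_2$ and $0 < \epsilon_1 + \epsilon_2 < \epsilon$.

Let $P_i$ be the probability measure induced by a distribution function $F_i$. Define

$$\mathcal{F}_0 = \{F = (F_1, \ldots, F_k) : P_i(\{b\}) = p_i, \ P_i(\{a\}) = 1 - p_i, \ 0 \le p_i \le 1, \text{ for } i = 1, 2, \ldots, k\}. \qquad (7)$$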

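If $F = (F_1, F_2, \ldots, F_k) \in \mathcal{F}_0$, then

$$\theta_i = \theta(F_i) = \epsilon_1 (1 - p_i) + (1 - \epsilon_2) p_i = \epsilon_1 + (1 - \epsilon_1 - \epsilon_2) p_i.$$

Therefore, $\theta(F_i) \le \theta_0$ if and only if

$$p_i \le \frac{\theta_0 - \epsilon_1}{1 - \epsilon_1 - \epsilon_2},$$

and $\theta(F_i) \ge \theta_0 + \Delta$ if and only if

$$p_i \ge \frac{\theta_0 - \epsilon_1}{1 - \epsilon_1 - \epsilon_2} + \frac{\Delta}{1 - \epsilon_1 - \epsilon_2}.$$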
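The two-point family used in the proof is easy to verify numerically. The following sketch, with hypothetical values for $\epsilon_1$, $\epsilon_2$, $\theta_0$ and $\Delta$, computes $\theta$ from the mass $p$ placed at $b$, inverts the relation, and checks that the two boundary values of $p$ map back to $\theta_0$ and $\theta_0 + \Delta$.

```python
def theta_two_point(p, eps1, eps2):
    """theta(F) when F puts mass p on b (g(b) = 1 - eps2) and 1 - p on a (g(a) = eps1)."""
    return eps1 * (1 - p) + (1 - eps2) * p


def p_from_theta(theta, eps1, eps2):
    """Invert theta = eps1 + (1 - eps1 - eps2) * p."""
    return (theta - eps1) / (1 - eps1 - eps2)


# Hypothetical values chosen only for illustration.
eps1, eps2, theta0, delta = 0.01, 0.02, 0.4, 0.2

p_bad = p_from_theta(theta0, eps1, eps2)      # largest p for which the population is bad
p_good = p_bad + delta / (1 - eps1 - eps2)    # smallest p for which the population is good

print(theta_two_point(p_bad, eps1, eps2), theta_two_point(p_good, eps1, eps2))
```

As $\epsilon_1 + \epsilon_2 \to 0$ the two boundaries tend to $\theta_0$ and $\theta_0 + \Delta$, which is the limit used at the end of the proof.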


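For $i = 1, 2, \ldots, k$, define

$$T_i = \#\{X_{ij} : X_{ij} = b, \ j = 1, 2, \ldots, n\}.$$

Note that for the class of distribution functions $\mathcal{F}_0$, the statistic $T = (T_1, T_2, \ldots, T_k)$ is a complete sufficient statistic. We also note that $T_1, T_2, \ldots, T_k$ are independent and have binomial distributions with parameters $(n, p_1), (n, p_2), \ldots, (n, p_k)$ respectively. Since $\mathcal{F}_0 \cap \mathcal{F}' \subset \mathcal{F}'$ and

$$\inf_{F \in \mathcal{F}'} S(\theta(F), \delta) \ge p,$$

we have

$$\inf_{F \in \mathcal{F}_0 \cap \mathcal{F}'} S(\theta(F), \delta) \ge p.$$

Also the binomial family possesses the monotone likelihood ratio property. By Theorem 2.2 we have

$$\sup_{F \in \mathcal{F}_0} R(\theta(F), \delta) \ge h\!\left(\frac{\theta_0 - \epsilon_1}{1 - \epsilon_1 - \epsilon_2}, \frac{\Delta}{1 - \epsilon_1 - \epsilon_2}, p\right).$$

And since $\mathcal{F} \supset \mathcal{F}_0$, we have

$$\sup_{F \in \mathcal{F}} R(\theta(F), \delta) \ge \sup_{F \in \mathcal{F}_0} R(\theta(F), \delta) \ge h\!\left(\frac{\theta_0 - \epsilon_1}{1 - \epsilon_1 - \epsilon_2}, \frac{\Delta}{1 - \epsilon_1 - \epsilon_2}, p\right).$$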
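Since $\epsilon > 0$ is arbitrary, letting $\epsilon \to 0$, we have

$$\frac{\theta_0 - \epsilon_1}{1 - \epsilon_1 - \epsilon_2} \to \theta_0 \quad \text{and} \quad \frac{\Delta}{1 - \epsilon_1 - \epsilon_2} \to \Delta.$$

As we noticed before, $h$ is a continuous function; letting $\epsilon \to 0$, we get

$$\sup_{F \in \mathcal{F}} R(\theta(F), \delta) \ge h(\theta_0, \Delta, p). \qquad (11)$$

This completes the proof of the lemma.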

Suppose that $X_i$ has a binomial distribution with parameters $(n, \theta_i)$, for $i = 1, 2, \ldots, k$. Let

$$\sup_{\theta} R_1(\theta, \delta') = R, \qquad (12)$$

where $\delta'$ is as defined in (2).

Lemma 2.2: If $\delta \in \mathcal{D}$ is a decision rule for Problem I then

$$\sup_{F \in \mathcal{F}} R_1(\theta(F), \delta) \ge R,$$

where $R$ is as defined in (12).

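Proof:

Fix $\epsilon > 0$ and get $\epsilon_1$, $\epsilon_2$, $a$, $b$, $\mathcal{F}_0 = \mathcal{F}_0(\epsilon_1, \epsilon_2)$ and $T = (T_1, T_2, \ldots, T_k)$ as in Lemma 2.1. Observe that $T_1, T_2, \ldots, T_k$ are independent binomial with parameters $(n, p_1), (n, p_2), \ldots, (n, p_k)$ respectively. Here $\theta_i = \theta(F_i) = \epsilon_1 + (1 - \epsilon_1 - \epsilon_2) p_i$ for each $i$, and $0 \le p_i \le 1$ if and only if $\epsilon_1 \le \theta_i \le 1 - \epsilon_2$. By Theorem 2.1, if $\delta$ is any decision rule then

$$\sup_{F \in \mathcal{F}_0} R_1(\theta(F), \delta) \ge \sup_{F \in \mathcal{F}_0} R_1(\theta(F), \delta_0),$$

where $\delta_0(x, a) = \delta'(T, a)$ and $\delta'$ is as defined by (2).

Since $\mathcal{F} \supset \mathcal{F}_0$, we have

$$\sup_{F \in \mathcal{F}} R_1(\theta(F), \delta) \ge \sup_{F \in \mathcal{F}_0} R_1(\theta(F), \delta),$$

and therefore

$$\sup_{F \in \mathcal{F}} R_1(\theta(F), \delta) \ge \sup_{F \in \mathcal{F}_0(\epsilon_1, \epsilon_2)} R_1(F, \delta_0) = R(\epsilon_1, \epsilon_2), \text{ say.}$$

For $F \in \mathcal{F}_0$,

$$R_1(\theta(F), \delta_0) = \sum_{a \in \mathcal{A}} E_F \delta'(T, a) \, L(\theta(F), a).$$

The expectation of $\delta'(T, a)$ depends on $F$ only through $\theta = \theta(F)$, and $\theta = (\epsilon_1 + (1 - \epsilon_1 - \epsilon_2) p_1, \epsilon_1 + (1 - \epsilon_1 - \epsilon_2) p_2, \ldots, \epsilon_1 + (1 - \epsilon_1 - \epsilon_2) p_k) \to (p_1, p_2, \ldots, p_k)$ as $\epsilon_1 + \epsilon_2 \to 0$. We know that for every $a \in \mathcal{A}$, $L(\cdot, a)$ is continuous over $[0, 1]^k$, hence it is uniformly continuous over $[0, 1]^k$. We have $R(\epsilon_1, \epsilon_2) \to R$ as $\epsilon_1 + \epsilon_2 \to 0$. Hence

$$\sup_{F \in \mathcal{F}} R_1(F, \delta) \ge R, \qquad (13)$$

and this completes the proof of the lemma.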

Let a random variable $X$ have a binomial distribution with parameters $(n, \theta_0 + \Delta)$. Let $c = c(p) \ge 0$ and $\lambda = \lambda(p) \in [0, 1)$ be such that

$$P(X > c) + \lambda P(X = c) = p.$$

Let $Z_1, Z_2, \ldots, Z_n$ be independent Bernoulli random variables with parameters $p_1, p_2, \ldots, p_n$. Define

$$\psi(p_1, p_2, \ldots, p_n) = P(Z_1 + Z_2 + \cdots + Z_n > c(p)) \qquad (14)$$

$$\phantom{\psi(p_1, p_2, \ldots, p_n) =} + \lambda(p) \, P(Z_1 + Z_2 + \cdots + Z_n = c(p)). \qquad (15)$$
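The function defined in (14) and (15) is an upper-tail probability of a sum of independent, not necessarily identically distributed, Bernoulli variables. A minimal sketch of how it might be evaluated is given below: the distribution of the sum is built by repeated convolution. The function names and the numerical inputs are assumptions for illustration, and the cutoff `c` is taken to be an integer between 0 and the number of summands.

```python
def sum_pmf(probs):
    """pmf of Z_1 + ... + Z_m for independent Bernoulli(probs[j]) variables."""
    pmf = [1.0]  # distribution of the empty sum
    for q in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for s, mass in enumerate(pmf):
            nxt[s] += mass * (1 - q)   # Z_j = 0
            nxt[s + 1] += mass * q     # Z_j = 1
        pmf = nxt
    return pmf


def psi(probs, c, lam):
    """P(sum > c) + lam * P(sum = c) for the Bernoulli sum with the given parameters."""
    pmf = sum_pmf(probs)
    return sum(pmf[c + 1:]) + lam * pmf[c]


# Hypothetical inputs: three Bernoulli parameters, cutoff c = 1, randomization lam = 0.5.
value = psi([0.5, 0.5, 0.5], c=1, lam=0.5)
print(value)
```

When all parameters are equal the sum is binomial, so `psi` reduces to the left-hand side of the equation that determines $c(p)$ and $\lambda(p)$.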
Let $Z = (Z_{11}, Z_{12}, \ldots, Z_{1n}, \ldots, Z_{k1}, Z_{k2}, \ldots, Z_{kn})$ be a random vector. Given $X = (X_{11}, X_{12}, \ldots, X_{1n}, \ldots, X_{k1}, X_{k2}, \ldots, X_{kn}) = x$, the components of $Z$ are independent Bernoulli with

$$P(Z_{ij} = 1 \mid x) = g(x_{ij}) \quad \forall \, i, j.$$

Let $T_i = \sum_{j=1}^{n} Z_{ij}$ and $T = (T_1, T_2, \ldots, T_k)$. Let $\delta'$ be as in Theorem 2.1. Define

$$\delta''(x, a) = E(\delta'(T, a) \mid X = x). \qquad (16)$$
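For small $k$ and $n$ the conditional expectation in (16) can be computed exactly by enumerating the possible values of $T$. The sketch below does this for the special case of selecting the single best population, where $\delta'$ picks uniformly among the populations with the largest $T_i$. The function names and the values of $g(x_{ij})$ are hypothetical, and the enumeration is only practical for small problems.

```python
from itertools import product


def sum_pmf(probs):
    """pmf of a sum of independent Bernoulli(probs[j]) variables."""
    pmf = [1.0]
    for q in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for s, mass in enumerate(pmf):
            nxt[s] += mass * (1 - q)
            nxt[s + 1] += mass * q
        pmf = nxt
    return pmf


def delta_double_prime_best(g_values):
    """g_values[i][j] = g(x_ij) in [0, 1]; returns P(select population i as best | x)."""
    pmfs = [sum_pmf(row) for row in g_values]  # conditional law of each T_i given x
    k = len(pmfs)
    out = [0.0] * k
    for t in product(*(range(len(pmf)) for pmf in pmfs)):
        weight = 1.0
        for i, t_i in enumerate(t):
            weight *= pmfs[i][t_i]
        top = max(t)
        winners = [i for i in range(k) if t[i] == top]
        for i in winners:  # delta' splits the mass evenly among tied maxima
            out[i] += weight / len(winners)
    return out


# Hypothetical g-values for k = 2 populations with n = 2 observations each.
probs = delta_double_prime_best([[0.9, 0.8], [0.1, 0.2]])
print(probs)
```

Taking the index with the largest entry of `probs` gives a non-randomized variant of the kind discussed in Remark 2.1.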

Theorem  2.3  : 

(i) The decision rule $\delta''(\cdot, \cdot) = \delta''(x, a)$ is a minimax rule for the class of loss functions defined for Problem I.

(ii) Let $\delta_i(x) = \psi(g(x_{i1}), g(x_{i2}), \ldots, g(x_{in}))$ for $i = 1, 2, \ldots, k$. Then the decision rule $\delta_m = (\delta_1, \delta_2, \ldots, \delta_k)$ is a restricted minimax procedure for Problem II.

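Proof:

(i)

$$R_1(\theta(F), \delta'') = E_F \sum_{a \in \mathcal{A}} \delta''(X, a) L(\theta(F), a)$$

$$= \sum_{a \in \mathcal{A}} [E_F \delta''(X, a)] \, L(\theta(F), a)$$

$$= \sum_{a \in \mathcal{A}} E_F [E(\delta'(T, a) \mid X = x)] \, L(\theta(F), a)$$

$$= \sum_{a \in \mathcal{A}} E_F \delta'(T, a) \, L(\theta(F), a).$$

We notice that $T_1, T_2, \ldots, T_k$ are independent random variables. The marginal distribution of $T_i$ is binomial with parameters $(n, \theta(F_i))$ for $i = 1, 2, \ldots, k$.

Hence

$$\sup_{F \in \mathcal{F}} R_1(F, \delta'') = \sup_{0 \le \theta_i \le 1} \Big[ \sum_{a \in \mathcal{A}} E_F \delta'(T, a) L(\theta, a) \Big] = R.$$

By Lemma 2.2 the result follows.

(ii)

Observe that $\delta_1(X), \delta_2(X), \ldots, \delta_k(X)$ are independent, and

$$E_F \delta_i(X) = E_F \psi(g(X_{i1}), g(X_{i2}), \ldots, g(X_{in})) = P(T_i > c(p)) + \lambda(p) P(T_i = c(p)).$$

Here $T_1, T_2, \ldots, T_k$ are independent binomial with parameters $(n, \theta(F_1)), (n, \theta(F_2)), \ldots, (n, \theta(F_k))$ respectively.

For $F \in \mathcal{F}$,

$$S(\theta(F), \delta_m(\cdot)) = S(\theta(F), \delta_s(\cdot)),$$

where $\delta_s = \delta_s(T)$ and $T = (T_1, T_2, \ldots, T_k)$.

The decision rule $\delta_s$ is as before and

$$R(\theta(F), \delta_m) = R(\theta(F), \delta_s).$$

By the choice of $c = c(p)$ and $\lambda = \lambda(p)$ and by (11) we have

$$\sup_{F \in \mathcal{F}} R(\theta(F), \delta_m) = \sup_{\theta} R(\theta, \delta_s) = h(\theta_0, \Delta, p)$$

and

$$\inf_{F \in \mathcal{F}'} S(\theta(F), \delta_m) = \inf_{\theta \in \Omega'} S(\theta, \delta_s) = p,$$

and this completes the proof.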
Remark 2.1

Notice that the minimax procedures we have established are randomized. To avoid randomization, for example in Problem I, we may take action $a \in \mathcal{A}$ if $\delta''(x, a) \ge \delta''(x, a')$ for all $a' \in \mathcal{A}$. We feel that this procedure may be approximately minimax and will be more useful in practical situations. However, it is difficult to establish analytic properties of this approximate procedure.

3  Discussion  and  Examples 

In  this  section  we  consider  a  few  examples  (problems)  and  consider  the  Bayes  rules  with 
respect  to  particular  priors  and  see  how  different  they  are  from  the  minimax  rules. 

Example  3.1 

Nonparametric  Bayes  Procedures 

Let $\theta$, as defined before, be the parameter of interest. Let us suppose that we are interested in selecting the population $\Pi_i$ for which $\theta_i$ is largest. Here

$$\theta_i = \theta(P_i) = \int_{\mathcal{X}} g(t) \, dP_i(t),$$

and $P_i$ is the probability measure corresponding to population $\Pi_i$. Let the probability measures $P_1, P_2, \ldots, P_k$ be independently and identically distributed with common Dirichlet distribution $D(\alpha)$. The probability measure $P$ on $\mathcal{X}$ is said to have a Dirichlet distribution with parameter $\alpha$ if, for any $k$ and any measurable partition $A_1, A_2, \ldots, A_k$ of $\mathcal{X}$, $(P(A_1), P(A_2), \ldots, P(A_k))$ has a Dirichlet distribution with parameters $(\alpha(A_1), \alpha(A_2), \ldots, \alpha(A_k))$.

See Ferguson (1973) and Ferguson (1974) for more discussion and for related information.

Let the loss function be of the form

$$L(\theta, i) = \max_j \theta_j - \theta_i.$$

That is, the loss for selecting the i-th population is $\max_j \theta_j - \theta_i$. Then according to the Bayes rule we select the population $\Pi_i$ for which $E(\theta(P_i) \mid X = x)$ is largest. Following Ferguson (1973), we know that the posterior distribution of $P_i$ is a Dirichlet distribution with parameter $\alpha + n P_{i,n}$, where $P_{i,n}$ is the empirical measure of the sample from $\Pi_i$. Hence the conditional expectation of $\theta_i$ is

$$\frac{1}{\alpha(\mathcal{X}) + n} \int_{\mathcal{X}} g \, d(\alpha + n P_{i,n}).$$

Hence $E(\theta(P_i) \mid X = x)$ is largest if and only if $\hat{\theta}_i = n^{-1} \sum_{j=1}^{n} g(x_{ij})$ is largest. The procedure we have established will also select the i-th population with high probability if $\hat{\theta}_i$ is large.
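The Bayes selection described in Example 3.1 can be sketched as follows. The posterior mean of $\theta_i$ under a Dirichlet process prior is the integral of $g$ against $\alpha$ plus the sample sum of $g$, divided by $\alpha(\mathcal{X}) + n$. In this sketch the base measure $\alpha$ is assumed, purely for illustration, to be discrete with a few atoms; the function names, the choice of $g$ and the data are all hypothetical.

```python
def posterior_mean(sample, g, alpha_atoms):
    """Posterior mean of theta = integral of g dP under a Dirichlet process prior.

    alpha_atoms maps each atom of the (assumed discrete) base measure alpha to its mass.
    """
    alpha_total = sum(alpha_atoms.values())
    prior_part = sum(mass * g(point) for point, mass in alpha_atoms.items())
    data_part = sum(g(x) for x in sample)
    return (prior_part + data_part) / (alpha_total + len(sample))


def bayes_select(samples, g, alpha_atoms):
    """Return the index with the largest posterior mean, together with all the means."""
    means = [posterior_mean(s, g, alpha_atoms) for s in samples]
    best = max(range(len(samples)), key=means.__getitem__)
    return best, means


def g(x):
    """Hypothetical bounded score: clip x to [0, 1]."""
    return min(max(x, 0.0), 1.0)


samples = [[0.2, 0.4, 0.9], [0.8, 0.7, 0.6], [0.1, 0.5, 0.3]]  # k = 3, n = 3
alpha_atoms = {0.0: 1.0, 1.0: 1.0}                            # alpha has total mass 2

best, means = bayes_select(samples, g, alpha_atoms)
sample_means = [sum(g(x) for x in s) / len(s) for s in samples]
print(best, means, sample_means)
```

Since the prior contribution and the denominator are the same for every population, the ordering of the posterior means coincides with the ordering of the sample averages of $g$.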

Example  3.2 

Let $X_{i1}, X_{i2}, \ldots, X_{in}$ be the observable random vectors from the population $\Pi_i$ for $i = 1, 2, \ldots, k$. Let $P_i$ be the probability measure generated by the random variable $X_{i1}$. Let $\theta_i = P_i(A)$ be the parameter of interest. Let $X_i = (X_{i1}, X_{i2}, \ldots, X_{in})$ and

$$T_i(X_i) = \sum_{j=1}^{n} I_A(X_{ij}).$$

Then according to the theorem above, the procedure which ranks population $\Pi_i$ according to the rank of $T_i$ among $T_1, T_2, \ldots, T_k$ is the minimax procedure for any permutation invariant loss function. Here $g(x) = I_A(x)$ is the indicator function of the set $A$.

We will show that this procedure is also a Bayes procedure when $P_i$, $i = 1, 2, \ldots, k$, are independently and identically distributed with the Dirichlet prior. To see this, notice that the posterior distribution of $P_i(A)$ is a beta distribution with parameters $p = \alpha(A) + n P_{i,n}(A)$ and $\gamma = c - p$, for $i = 1, 2, \ldots, k$, where $c = \alpha(\mathcal{X}) + n$ is a fixed constant. We also notice that if a random variable $Y$ has a beta distribution with parameters $p$ and $c - p$ ($c$ a fixed constant), then $Y$ has a monotone likelihood ratio in $p$. From these facts it is straightforward to prove that if $L$ is any invariant loss function then the rule of ranking the k populations according to the ranks of $T_i$, $i = 1, 2, \ldots, k$, is a Bayes rule, and hence it is admissible.
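A small sketch of the computation in Example 3.2 is given below. For each population it counts the observations falling in $A$, forms the beta posterior parameters $p = \alpha(A) + T_i$ and $\gamma = \alpha(\mathcal{X}) + n - p$, and ranks the populations by $T_i$. The set $A$, the prior masses and the data are hypothetical choices for illustration.

```python
def beta_posterior(sample, in_A, alpha_A, alpha_total):
    """Return (T, p, gamma): the count in A and the beta posterior parameters of P(A)."""
    n = len(sample)
    t = sum(1 for x in sample if in_A(x))
    p = alpha_A + t
    gamma = alpha_total + n - p
    return t, p, gamma


def in_A(x):
    """Hypothetical set A = (0.5, infinity)."""
    return x > 0.5


# Hypothetical data: k = 3 populations, n = 4 observations each.
samples = [[0.1, 0.7, 0.9, 0.4], [0.6, 0.8, 0.9, 0.7], [0.2, 0.3, 0.6, 0.1]]
results = [beta_posterior(s, in_A, alpha_A=1.0, alpha_total=2.0) for s in samples]

# Rank populations from largest to smallest T_i.
ranking = sorted(range(len(samples)), key=lambda i: results[i][0], reverse=True)

# Posterior means p / (p + gamma) share the denominator alpha(X) + n.
posterior_means = [p / (p + gamma) for _, p, gamma in results]
print(ranking, posterior_means)
```

Because $p + \gamma$ is the same constant for every population, ordering by posterior mean and ordering by $T_i$ agree in this example.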

The  same  argument  holds  for  the  (restricted)  subset  selection  procedures. 
4 Testing

Let $X_1, X_2, \ldots, X_n$ be observable independent random vectors with a common distribution function $F$ on $R^p$.

Let $\theta(F) = \int g \, dF$ be the parameter of interest, where $g$ is a real-valued bounded function on $R^p$ such that $\sup g(x) = 1$ and $\inf g(x) = 0$. We will construct “optimal” tests for testing the hypothesis

$$H_0 : \theta(F) \in \Theta_0 \quad \text{vs} \quad H_1 : \theta(F) \in \Theta_1.$$

Here $\Theta_1 = \Theta_0^c$ and $\Theta_0$ is of the form $\{\theta : \theta \le \theta_0\}$, $\{\theta : \theta \ge \theta_0\}$ or $\{\theta : \theta = \theta_0\}$.

Comparison between tests of level $\alpha$ is made on the basis of the “power” of the test. The power of a test $\phi_0$ at $F$ for $\theta(F) \in \Theta_1$, $\Pr(\phi_0 \text{ rejects } H_0 \mid F)$, is a function of $F$ and not of $\theta(F)$ alone. We take a conservative view in choosing the test: we select the test of level $\alpha$ which maximizes the minimum power. We need the following definitions.

Definition 4.1 The function $\beta(\theta) = \beta_\phi(\theta)$ is called a minimum power function of the test $\phi$ if

$$\beta_\phi(\theta) = \inf_{F : \, \theta(F) = \theta} \Pr(\phi \text{ rejects } H_0 \mid F).$$

Definition 4.2 The test $\phi$ of level $\alpha$ is said to be the least uniformly most powerful (LUMP) test if for any test $\phi'$ of level $\alpha$,

$$\beta_\phi(\theta) \ge \beta_{\phi'}(\theta) \quad \forall \, \theta \in \Theta_1.$$

Definition 4.3 The test $\phi$ of level $\alpha$ is said to be the least uniformly most powerful unbiased (LUMPU) test of level $\alpha$ if

$$\beta_\phi(\theta) \ge \alpha \quad \forall \, \theta \in \Theta_1,$$

and if $\phi'$ is any other test of level $\alpha$ with

$$\beta_{\phi'}(\theta) \ge \alpha \quad \forall \, \theta \in \Theta_1,$$

then

$$\beta_\phi(\theta) \ge \beta_{\phi'}(\theta) \quad \forall \, \theta \in \Theta_1.$$

Let

$$h(p_1, p_2, \ldots, p_n) = \Pr(Z_1 + Z_2 + \cdots + Z_n > c) + \lambda \Pr(Z_1 + Z_2 + \cdots + Z_n = c),$$

where $Z_1, Z_2, \ldots, Z_n$ are independent Bernoulli with parameters $p_1, p_2, \ldots, p_n$ respectively. The constants $\lambda = \lambda(\alpha)$ and $c = c(\alpha)$ are chosen such that, if $Z$ is a binomial random variable with parameters $(n, \theta_0)$, then

$$\Pr(Z > c(\alpha)) + \lambda(\alpha) \Pr(Z = c(\alpha)) = \alpha.$$

Theorem 4.1 For testing

$$H_0 : \theta(F) \le \theta_0 \quad \text{vs} \quad H_1 : \theta(F) > \theta_0$$

the test $\phi(x) = h(g(x_1), g(x_2), \ldots, g(x_n))$ is a least uniformly most powerful test.
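The test of Theorem 4.1 can be sketched numerically: first calibrate $c(\alpha)$ and $\lambda(\alpha)$ against a Binomial$(n, \theta_0)$ variable, then evaluate the rejection probability $h(g(x_1), \ldots, g(x_n))$ for the observed values. The function names and the numbers below are assumptions for illustration; the cutoff is taken to be an integer and $\alpha$ to lie strictly between 0 and 1.

```python
def sum_pmf(probs):
    """pmf of a sum of independent Bernoulli(probs[j]) variables."""
    pmf = [1.0]
    for q in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for s, mass in enumerate(pmf):
            nxt[s] += mass * (1 - q)
            nxt[s + 1] += mass * q
        pmf = nxt
    return pmf


def level_constants(n, theta0, alpha):
    """Integer c and lam in [0, 1) with P(Z > c) + lam * P(Z = c) = alpha, Z ~ Bin(n, theta0)."""
    pmf = sum_pmf([theta0] * n)
    upper = 0.0  # running value of P(Z > c)
    for c in range(n, -1, -1):
        if upper + pmf[c] > alpha:
            return c, (alpha - upper) / pmf[c]
        upper += pmf[c]
    raise ValueError("alpha must be strictly below 1")


def phi(g_values, c, lam):
    """Rejection probability h(g(x_1), ..., g(x_n)) of the randomized test."""
    pmf = sum_pmf(g_values)
    return sum(pmf[c + 1:]) + lam * pmf[c]


# Hypothetical setting: n = 5, theta_0 = 0.5, level alpha = 0.05.
n, theta0, alpha = 5, 0.5, 0.05
c, lam = level_constants(n, theta0, alpha)

# Hypothetical observed values g(x_1), ..., g(x_5).
reject_prob = phi([0.9, 0.8, 0.95, 0.7, 0.85], c, lam)
print(c, lam, reject_prob)
```

The value returned by `phi` is the probability with which the randomized test rejects $H_0$ for the given observations; feeding in $n$ copies of $\theta_0$ recovers the level $\alpha$.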


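Proof:

Let us fix an arbitrarily small $\epsilon > 0$, and get $a, b \in R^p$ such that $g(a) = \epsilon_1$, $g(b) = 1 - \epsilon_2$ and $\epsilon_1 + \epsilon_2 < \epsilon$. Let

$$\mathcal{F}_0 = \left\{ F : F(x) = \begin{cases} 0 & \forall \, x < a \\ 1 - p & \forall \, x \in [a, b) \\ 1 & \forall \, x \ge b \end{cases} \right\}.$$

For the class $\mathcal{F}_0$, $T = \#\{X_i : X_i = b\}$ is a sufficient statistic and has a binomial distribution with parameters $(n, p)$, where

$$\theta(F) = \epsilon_1 (1 - p) + (1 - \epsilon_2) p = \epsilon_1 + (1 - \epsilon_1 - \epsilon_2) p$$

and $\theta > \theta_0$ if and only if

$$p > \frac{\theta_0 - \epsilon_1}{1 - \epsilon_1 - \epsilon_2} = p(\epsilon_1, \epsilon_2), \text{ say.}$$

For the class $\mathcal{F}_0$ the UMP test is $\phi_1(T)$, where

$$\phi_1(T) = \begin{cases} 1 & \text{if } T > c(\epsilon_1, \epsilon_2) \\ \lambda(\epsilon_1, \epsilon_2) & \text{if } T = c(\epsilon_1, \epsilon_2) \\ 0 & \text{if } T < c(\epsilon_1, \epsilon_2). \end{cases}$$

The constants $c = c(\epsilon_1, \epsilon_2)$ and $\lambda = \lambda(\epsilon_1, \epsilon_2)$ are chosen such that $\Pr(X > c) + \lambda \Pr(X = c) = \alpha$, where $X$ is a binomial random variable with parameters $(n, p(\epsilon_1, \epsilon_2))$. The power of the test $\phi_1(T)$ at $\theta = \theta_1 > \theta_0$ is $\Pr(T > c(\epsilon_1, \epsilon_2)) + \lambda(\epsilon_1, \epsilon_2) \Pr(T = c(\epsilon_1, \epsilon_2))$. Let

$$h_1\!\left(\frac{\theta_1 - \epsilon_1}{1 - \epsilon_1 - \epsilon_2}, \frac{\theta_0 - \epsilon_1}{1 - \epsilon_1 - \epsilon_2}, \alpha\right) = \Pr(T > c(\epsilon_1, \epsilon_2)) + \lambda(\epsilon_1, \epsilon_2) \Pr(T = c(\epsilon_1, \epsilon_2)).$$

So, if $\phi'$ is any test of level $\alpha$, then

$$\beta_{\phi'}(\theta_1) = \inf_{F : \, \theta(F) = \theta_1} \Pr(\phi' \text{ rejects } H_0 \mid F)$$

$$\le \inf_{\theta(F) = \theta_1, \, F \in \mathcal{F}_0} \Pr(\phi' \text{ rejects } H_0 \mid F)$$

$$\le h_1\!\left(\frac{\theta_1 - \epsilon_1}{1 - \epsilon_1 - \epsilon_2}, \frac{\theta_0 - \epsilon_1}{1 - \epsilon_1 - \epsilon_2}, \alpha\right).$$

We know that $h_1$ is a continuous function; letting $\epsilon \to 0$, we have

$$\beta_{\phi'}(\theta_1) \le h_1(\theta_1, \theta_0, \alpha).$$

On the other hand,

$$\Pr(\phi \text{ rejects } H_0 \mid F) = E_F \, h(g(X_1), g(X_2), \ldots, g(X_n)) = \Pr(Z_1 + Z_2 + \cdots + Z_n > c) + \lambda \Pr(Z_1 + Z_2 + \cdots + Z_n = c).$$

Here $Z_1, Z_2, \ldots, Z_n$ are independent Bernoulli with common parameter $\theta(F)$.

Hence

$$\beta_\phi(\theta_1) = h_1(\theta_1, \theta_0, \alpha),$$

that is,

$$\beta_{\phi'}(\theta_1) \le \beta_\phi(\theta_1) \quad \forall \, \theta_1 > \theta_0.$$

This proves the theorem.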